forked from NVIDIA/Megatron-LM
From NVIDIA Megatron-LM for visibility #18
Open
RaymondLi0 wants to merge 4,946 commits into bigcode-project:multi-query-attention from NVIDIA:main
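The base branch carries the BigCode multi-query-attention work. As background only, below is a minimal PyTorch sketch of the multi-query attention idea (all query heads attending over one shared key/value head, which shrinks the K/V projections and the inference KV cache); the class and parameter names are illustrative assumptions, not the branch's actual implementation.

```python
import torch
import torch.nn as nn


class MultiQueryAttention(nn.Module):
    """Illustrative sketch: one shared K/V head serves all query heads."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size)
        # A single K head and a single V head instead of num_heads of each.
        self.k_proj = nn.Linear(hidden_size, self.head_dim)
        self.v_proj = nn.Linear(hidden_size, self.head_dim)
        self.out_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, s, _ = x.shape
        q = self.q_proj(x).view(b, s, self.num_heads, self.head_dim).transpose(1, 2)  # (b, h, s, d)
        k = self.k_proj(x).unsqueeze(1)  # (b, 1, s, d), broadcast across all heads
        v = self.v_proj(x).unsqueeze(1)  # (b, 1, s, d)
        scores = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5       # (b, h, s, s)
        causal = torch.triu(torch.ones(s, s, dtype=torch.bool, device=x.device), diagonal=1)
        scores = scores.masked_fill(causal, float("-inf"))
        out = scores.softmax(dim=-1) @ v                                # (b, h, s, d)
        return self.out_proj(out.transpose(1, 2).reshape(b, s, -1))


# Same call signature as standard multi-head self-attention.
mqa = MultiQueryAttention(hidden_size=512, num_heads=8)
y = mqa(torch.randn(2, 16, 512))  # -> (2, 16, 512)
```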
Conversation
Expose TE fused MLP with module spec See merge request ADLR/megatron-lm!3384
Co-authored-by: root <[email protected]> Co-authored-by: William Dykas <[email protected]>
Moe inference functional tests See merge request ADLR/megatron-lm!3403
Co-authored-by: Oliver Koenig <[email protected]>
ci: Benchmark release tests suite with TE2.2 on H100 See merge request ADLR/megatron-lm!3458
Move data to GPU for TP data processing See merge request ADLR/megatron-lm!3371
…, embedding tying" This reverts commit 5ae21f8.
Signed-off-by: oliver könig <[email protected]>
Co-authored-by: Gao Deng <[email protected]> Co-authored-by: Gao Deng <[email protected]>
Optimize dummy weight tensors for cudagraph and fix shape mismatch between vision and language transformer See merge request ADLR/megatron-lm!3366
Add --enable-experimental to args. See merge request ADLR/megatron-lm!3377
Co-authored-by: Zijie Yan <[email protected]>
perf(MLA): MLA down proj switch back to TELinear See merge request ADLR/megatron-lm!3281
ci: Retry on network errors See merge request ADLR/megatron-lm!3463
Co-authored-by: Oliver Koenig <[email protected]> Co-authored-by: Guyue Huang <[email protected]> Co-authored-by: Guyue Huang <[email protected]>
Add TE functional tests See merge request ADLR/megatron-lm!3361
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: oliver könig <[email protected]>
…to TELinear" This reverts commit e63aee4.
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: oliver könig <[email protected]>
Co-authored-by: Selvaraj Anandaraj <[email protected]> Co-authored-by: Selvaraj Anandaraj <[email protected]>
Added support for offloading Swiglu activations to CPU See merge request ADLR/megatron-lm!3024
Force inference to always gather logits with tensor parallelism See merge request ADLR/megatron-lm!3442
Only run prefill for requests that do not generate tokens See merge request ADLR/megatron-lm!3499
Co-authored-by: Cyril Meurillon <[email protected]> Co-authored-by: Cyril Meurillon <[email protected]>
Enable reruns by default See merge request ADLR/megatron-lm!2739
Co-authored-by: Ye Yu <[email protected]> Co-authored-by: Chenhan Yu <[email protected]>
Clean up ModelOpt finetune scripts and add validation feature See merge request ADLR/megatron-lm!3268
Fix typo in parallel_state expert parallelism See merge request ADLR/megatron-lm!3548
Fix cuda graph logic to determine first/last layers per stage in flexible pp layout See merge request ADLR/megatron-lm!3505
Remove extra barrier in checkpoint flow See merge request ADLR/megatron-lm!3626
Fix error when TE is not installed See merge request ADLR/megatron-lm!3625
Adding support for Spike No More embedding initializations and associated weight decay skipping. See merge request ADLR/megatron-lm!3500
MiMo video VLM train example See merge request ADLR/megatron-lm!3543
ci: Retry on `free(): invalid pointer` See merge request ADLR/megatron-lm!3632
Signed-off-by: oliver könig <[email protected]>
Co-authored-by: Keshav Santhanam <[email protected]> Co-authored-by: William Dykas <[email protected]>
Add Dynamic Backend Inference Tests See merge request ADLR/megatron-lm!3475
fix(distckpt, moe): Fix distckpt optimizer state loading with PP>1 to ensure bit-wise match after saving and loading. See merge request ADLR/megatron-lm!3394
tests: Fix segfaults (maybe?) See merge request ADLR/megatron-lm!3605
Co-authored-by: liu-zichen <[email protected]>
Fix mrope with context parallel See merge request ADLR/megatron-lm!3603
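For context on one entry above, "Added support for offloading Swiglu activations to CPU" (ADLR/megatron-lm!3024): the sketch below shows a plain SwiGLU MLP and marks the intermediate activation that such offloading would move to host memory between forward and backward. The fused gate/up layout and all names are assumptions for illustration, not Megatron-LM's implementation, and the offloading machinery itself is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUMLP(nn.Module):
    """Illustrative SwiGLU feed-forward block: SiLU(gate) * up, then a down projection."""

    def __init__(self, hidden_size: int, ffn_size: int):
        super().__init__()
        # Fused gate/up projection, a common layout for SwiGLU MLPs (an assumption here).
        self.w_in = nn.Linear(hidden_size, 2 * ffn_size, bias=False)
        self.w_out = nn.Linear(ffn_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate, up = self.w_in(x).chunk(2, dim=-1)
        # SiLU(gate) * up is the large intermediate activation that CPU
        # offloading would stash off-GPU until the backward pass needs it.
        return self.w_out(F.silu(gate) * up)


mlp = SwiGLUMLP(hidden_size=512, ffn_size=1376)
y = mlp(torch.randn(2, 16, 512))  # -> (2, 16, 512)
```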
No description provided.